ABSTRACT

Using numerical simulations of structure formation, we investigate multiple methods of determining the
strength of the proximity effect in the H I Lyα forest. We analyze three high resolution (~10 kpc) redshift
snapshots (z = 4, 3 and 2.25) of a Hydro-Particle-Mesh simulation to obtain realistic absorption spectra of
the H I Lyα forest. We model the proximity effect along the simulated sight lines with a simple analytical
prescription based on the assumed quasar luminosity and the intensity of the cosmic UV background. We
begin our analysis by investigating the intrinsic biases thought to arise in the widely adopted standard technique
of combining multiple lines of sight when searching for the proximity effect. We confirm the existence of these
biases, albeit smaller than previously predicted with simple Monte Carlo simulations. We then concentrate
on the analysis of the proximity effect along individual lines of sight. After determining its strength with a
fiducial value of the UV background intensity, we construct the proximity effect strength distribution (PESD).
We confirm that the PESD inferred from the simple averaging technique accurately recovers the input strength
of the proximity effect at all redshifts. Moreover, the PESD closely follows the behaviors found in observed
samples of quasar spectra. However, the PESD obtained from our new simulated sight lines differs somewhat
from that of simple Monte Carlo simulations. At all redshifts, we identify the smaller dispersion of
the strength parameters as the source of the correspondingly smaller biases found when combining multiple lines
of sight. After developing three new theoretical methods of recovering the strength of the proximity effect on
individual lines of sight, we compare their accuracy to the PESD from the simple averaging technique. All our
new approaches are based on the maximization of the likelihood function, albeit invoking some modifications.
The new techniques presented here, in spite of their complexity, fail to recover the input proximity effect in an
unbiased way, presumably due to some (unknown) higher order correlations in the spectrum. Thus, employing
complex 3D simulations, we provide strong evidence in favor of the proximity effect strength distribution
obtained from the simple averaging technique as a method of estimating the UV background intensity, free of
any intrinsic biases.

Subject headings: diffuse radiation - intergalactic medium - quasars: absorption lines 



1. INTRODUCTION 

The transition from a neutral to an ionized state of the baryonic
matter in the Universe, known as the epoch of reionization, also
resulted in the appearance of the cosmic ultraviolet background
radiation field (UVB). While it is still debated whether more
exotic objects and processes (like mini-quasars or dark matter
annihilation) had significant influence on the process of
reionization (Haiman & Loeb 1998; Ricotti & Ostriker 2004), it is
widely accepted that young star-forming galaxies and quasars are
the primary sources of this radiation field in the
post-reionization era (z < 6). Thus, after reionization, any
change in the properties of the source population is reflected in
the evolution of the UV background (Haardt & Madau 1996; Fardal
et al. 1998; Haardt & Madau 2001). Accurate estimates of the UVB
intensity at different redshifts therefore provide important
constraints on the evolution of star-forming galaxies and quasars
in the Universe.

The most direct probe for the UVB is the ionization state of
the intergalactic medium (IGM). Mainly consisting of hydrogen
and helium, the IGM becomes detectable as the light from high
redshift (z > 2) quasars travels toward us through intergalactic
space. Numerous absorption lines observed at wavelengths shorter
than the rest frame Lyα transition, known as the Lyα forest,
arise from the small fraction of neutral matter (about 1 part in
100,000) in the IGM (Sargent et al. 1980; Weymann et al. 1981;
Rauch 1998). The UVB is directly responsible for keeping the IGM
ionized at this level, thus encoding its intensity (and, to a
lesser extent, its spectrum) in the absorption profiles of the
Lyα forest.

Observationally, the only technique known so far to directly
infer the photoionization rate or, equivalently, the UVB
intensity over a range of wavelengths is based on the so-called
proximity effect. This effect is the manifestation of the IGM
response to a systematic enhancement of UV radiation around
bright quasars.

In the vicinity of a bright quasar, its UV radiation becomes
several orders of magnitude stronger than the cosmic UVB, leading
to decreased absorption blueward of the quasar Lyα emission line
(Weymann et al. 1981; Carswell et al. 1982; Murdoch et al. 1986).
If the quasar luminosity is known, and the enhancement in the UV
flux near the quasar relative to the average Universe is measured
from the Lyα absorption spectra, the strength of the cosmic UVB
can be deduced from the proximity effect (Carswell et al. 1987;
Bajtlik et al. 1988). While the proximity effect has been
detected for more than a decade, primarily in large samples of
quasars (e.g. Bajtlik et al. 1988; Lu et al. 1991; Giallongo et
al. 1996; Cooke et al. 1997; Scott et al. 2000; Liske & Williger
2001), recent investigations of its signature along individual
lines of sight have been employed to develop a new technique for
estimating the UVB intensity (Dall'Aglio et al. 2008b,a).

Table 1

Input parameters of the HPM simulation.

Parameter            Value      Parameter    Value
$\Omega_m$           0.237      $N_p$        $1024^3$
$\Omega_\Lambda$     0.763      Mesh         $1024^3$
$\Omega_b$           0.041      Cell size    0.01‡
$h$                  0.735†     Box size     10.24‡
$\sigma_8$           0.742      z            4.0, 3.5, 3.25, 3.0,
                                             2.75, 2.5, 2.25

†: in units of 100 km s$^{-1}$ Mpc$^{-1}$
‡: in units of Mpc

This new approach is based on the analysis of the proxim- 
ity effect strength distribution (PESD). Two distinct features 
appear in the analysis of the PESD. First, the strength distribu- 
tion shows a clear peak and, second, it is significantly asym- 
metric. The peak of the PESD directly relates to the intensity 
of the UVB, whereas its asymmetry is mainly the result of 
low number statistics in the absorber counts near the quasar 
emission (Dall'Aglio et al. 2008a, hereafter Paper II). 

This approach is nevertheless subject to a large dispersion, as
it is based on the detection of the proximity effect along
individual sight lines. Such a dispersion is inversely related to
the change in the opacity in the Lyα forest, and it is further
amplified by effects like overdensities or quasar variability,
which are poorly understood. We are therefore motivated to
initiate a theoretical investigation on the methodological
approach of estimating the strength of the proximity effect.

The plan of the paper is as follows. We begin with a description
of the type of simulations employed in Sect. 2. We then describe
in detail in Section 3 the computation and calibration of the
synthetic sight lines generated through the simulation box.
Section 4 introduces the theoretical approach adopted to include
the proximity effect on the lines of sight. We report in Sect. 5
our results for different approaches in estimating the proximity
effect signature on individual objects. We then present our
conclusions in Sect. 6.

2. SIMULATIONS 

In order to simulate moderate volumes of the Universe at high
accuracy but with limited computational resources, we use the
Hydro-Particle-Mesh (HPM) code developed by Gnedin & Hui (1998).
This particular class of numerical codes differs from those
following only the dark matter in its capability of modeling both
the dark matter and the baryonic components of the Universe.
However, an HPM simulation is not as computationally expensive as
a full hydrodynamical one.

The IGM consists of the low density cosmic gas between collapsed
objects. In this low density regime there exists a tight
correlation between the gas density and temperature in the form

$$ T = T_0 (1 + \delta)^{\gamma - 1} \tag{1} $$

where $\delta$ is the baryonic density contrast, $T_0$ is the
temperature at the mean density, which is of the order of
$10^4$ K, and $\gamma$ ranges between 1 and 1.6. For this reason,
the thermal history of the low density component of the IGM can
be described with high accuracy by the evolution of the two
parameters $T_0$ and $\gamma$. Both parameters are functions of
time and are sensitive to the ionization history of the Universe.
Equation 1, also known as the effective equation of state,
immediately provides the thermal pressure of the gas as a
function of density, thus removing the need for a full
hydrodynamical solver in the code (Hui et al. 1997; Gnedin & Hui
1998).

Figure 1. Adopted evolution of $T_0$ and $\gamma$ with redshift, in comparison with
measurements of the equation of state from Ricotti et al. (2000) (triangles).
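As a minimal sketch of how Eq. 1 is used, the following Python snippet evaluates the gas temperature for a given baryonic density contrast. The function name and the default values $T_0 = 10^4$ K and $\gamma = 1.4$ are illustrative assumptions, chosen within the ranges quoted above; they are not the values adopted in the simulation.

```python
def igm_temperature(delta, t0=1.0e4, gamma=1.4):
    """Gas temperature [K] from the effective equation of state (Eq. 1).

    delta : baryonic density contrast (must be larger than -1)
    t0    : temperature at the mean density in K (illustrative default)
    gamma : slope of the equation of state (illustrative default)
    """
    if delta <= -1.0:
        raise ValueError("density contrast must be larger than -1")
    return t0 * (1.0 + delta) ** (gamma - 1.0)


# At the mean density (delta = 0) the temperature equals T_0 by construction.
t_mean = igm_temperature(0.0)
# For gamma = 1 the gas is isothermal: no dependence on density.
t_iso = igm_temperature(5.0, gamma=1.0)
```

Since the pressure of an ideal gas scales as the product of density and temperature, a relation of this kind is all that is needed to close the equations without solving for the gas energy explicitly.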

The thermal evolution of the IGM after reionization is mainly
determined by the balance between adiabatic cooling (expansion of
the Universe) and photoionization heating of cosmic gas.
Additional effects that influence the effective equation of state
include Compton heating from X-ray sources (e.g. Madau &
Efstathiou 1999) and radiative transfer effects during He II
reionization (e.g. Maselli & Ferrara 2005). In this work we adopt
an empirical approach, and use observational constraints on the
effective equation of state to ensure that the thermal state of
the Lyα forest in our models is realistic.

Observational constraints on the parameters $T_0$ and $\gamma$
come from analyses of Lyα absorption lines (Ricotti et al. 2000;
Schaye et al. 2000; McDonald et al. 2001a), from the Doppler
parameter distribution as a function of column density. The lower
cut-off of the b-N distribution can be fitted by a power law
$b = b_{N_0} (N/N_0)^{\beta}$, in which the proportionality
constant $b_{N_0}$ and the power law index $\beta$ directly
relate to $T_0$ and $\gamma$, respectively.

The effective equation of state in the simulation was set in a
piece-wise manner in three different intervals. At z < 4.5 we
used the observed evolution of $T_0$ and $\gamma$ (Ricotti et al.
2000; Schaye et al. 2000; McDonald et al. 2001a). Between z = 6.5
and z = 4.5 we used the effective equation of state from
reionization simulations of Gnedin & Fan (2006); these
simulations match well the observed Lyα opacity in the spectra of
high redshift quasars discovered in the Sloan Digital Sky Survey
(SDSS) and smoothly merge with the observational constraints on
$T_0$ and $\gamma$ at z ≈ 4.5. Finally, during the reionization
era (z > 6.5) $T_0$ and $\gamma$ were assumed to increase
linearly with the scale factor. This assumption is somewhat
uncertain, but it is approximately consistent with high
resolution numerical simulations of reionization (Gnedin 2004;
Gnedin & Fan 2006) and has a negligible effect on the thermal
state of the Lyα forest at our redshifts of interest, 2 < z < 4.
Figure 1 shows the parameterized values of $T_0$ and $\gamma$ as
a function of cosmic time up to the final redshift used in the
simulation.

Figure 2. Example of a sight line drawn through the simulation box at
redshift z = 3. Top panel: baryonic overdensity as a function of position along the
line of sight. The distance scale corresponds to the x coordinate of Equation 2
(see text for more details). Bottom panel: the inferred hydrogen transmitted
flux as a function of wavelength.

For the purpose of this work we are not interested in an accurate
calibration of the effective equation of state with all
observational constraints, simply because these parameters are
poorly estimated, yielding a large scatter of results (McDonald
et al. 2001b; Schaye et al. 2000). The relevant fact is that
Eq. 1 defines the underlying equation of state and that $T_0$ and
$\gamma$ do evolve with redshift according to a specific
ionization history.

Following the results of the Wilkinson Microwave Anisotropy Probe
three year data (WMAP3, Spergel 2006), Tab. 1 lists the
parameters adopted to generate the simulations discussed in this
work. Here $\Omega_m$ is the total matter density parameter,
$\Omega_\Lambda$ is the cosmological constant and $\Omega_b$ is
the baryon density parameter. The Hubble constant $h$ is
expressed in units of 100 km s$^{-1}$ Mpc$^{-1}$ and $\sigma_8$
represents the rms density fluctuation on 8 Mpc scales at z = 0.
We fixed the box size to 10.24 Mpc with $N_p = 1024^3$ particles
on a $1024^3$ mesh. This yields a resolution element of 10 kpc,
ensuring an accuracy on a few km s$^{-1}$ scale in the generation
of the artificial quasar spectra (see Section 3.1). We recorded
the state of the simulation at seven different redshifts, denoted
by z.

3. THE LYMAN FOREST 

3.1. Computation of the H I absorption 

The final product of an HPM simulation consists of a cosmological
box (one at each z), containing information about the hydrogen
density contrast $\delta_b$ and the relative spatial velocity
$(v_x, v_y, v_z)$. We use this information to compute a set of
absorption spectra as follows. We draw a set of 500 randomly
distributed sight lines through the box, obtaining along each
line of sight a spatial coordinate plus velocity and density
information. In order to compute the absorption spectrum of the
Lyα forest, we follow the methodology of Hui et al. (1997), which
we briefly summarize here.

The optical depth of the Lyα forest at the observed wavelength
$\lambda_0$ is given by

$$ \tau(\lambda_0) = \int_{x_A}^{x_B} \frac{dx}{1+z}\, n_{\rm HI}\, \sigma_\alpha \tag{2} $$

with $x$ being the comoving radial coordinate along the line of
sight, $z$ is the redshift and $n_{\rm HI}$ is the neutral
hydrogen density at location $x$. The Lyα absorption cross
section is $\sigma_\alpha$.

If we expand the redshift scale around the mean redshift of
interest $\bar{z}$ (in our case the snapshot redshift of our
simulation), we can introduce a new coordinate $u$ defined as

$$ u = \frac{\bar{H}}{1+\bar{z}} (x - \bar{x}) + v_{\rm pec}(x) \tag{3} $$

where $\bar{H}$ is the Hubble parameter at $\bar{z}$,
$v_{\rm pec}(x)$ is the peculiar velocity along the line of
sight, and $\bar{x}$ is the position at which the redshift due to
cosmological expansion is equal to the snapshot redshift
$\bar{z}$. For simplicity we assume that the line of sight starts
at the snapshot redshift, thus at $\bar{x} = 0$.

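It is convenient to substitute the observed wavelength
$\lambda_0$ with a new velocity coordinate $u_0$, which is
related to $\lambda_0$ by

$$ \lambda_0 = \lambda_\alpha (1+\bar{z}) \left(1 + \frac{u_0}{c}\right) \tag{4} $$

where $\lambda_\alpha = 1215.67$ Å and $c$ is the speed of light.
In this notation the optical depth becomes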



$$ \tau(u_0) = \sum \int_{u_A}^{u_B} \frac{n_{\rm HI}}{1+\bar{z}} \left|\frac{du}{dx}\right|^{-1} \sigma_\alpha \, du, \tag{5} $$

where

$$ \sigma_\alpha = \sigma_{\alpha,0} \frac{c}{b\sqrt{\pi}} \exp\left[-\frac{(u-u_0)^2}{b^2}\right]. \tag{6} $$

The limits of integration $u_A$ and $u_B$ correspond to the
velocity values of the positions $x_A$ and $x_B$. The value of
$\sigma_{\alpha,0}$ depends only on fundamental constants and is
approximately $4.5 \times 10^{-18}$ cm$^2$. The Doppler parameter
$b$ is equal to $\sqrt{2 k_B T / m_p}$, where $k_B$ is the
Boltzmann constant, $T$ is the gas temperature at the velocity
$u$, and $m_p$ is the proton mass. In order to compute the gas
temperature at a given velocity and for a particular snapshot, we
used our equation of state parameterization ($T_0$, $\gamma$).
The sum in the integral accounts for velocity caustics, where one
value of $u$ corresponds to more than one $x$.
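The computation summarized above can be sketched in a few lines of Python. The sketch below is not the code used for this work: it assumes a uniformly spaced sight line, evaluates the integral in real space (summing over cells in $x$ rather than integrating over $u$, which accounts for velocity caustics automatically), and uses hypothetical function names, rounded physical constants and an illustrative Hubble parameter.

```python
import math

C_KMS = 2.998e5          # speed of light [km/s]
SIGMA_ALPHA0 = 4.5e-18   # cross-section normalization [cm^2], value quoted in the text
K_B = 1.381e-16          # Boltzmann constant [erg/K]
M_P = 1.673e-24          # proton mass [g]
CM_PER_MPC = 3.086e24    # centimeters per megaparsec


def doppler_b(temperature):
    """Doppler parameter b = sqrt(2 k_B T / m_p), returned in km/s."""
    return math.sqrt(2.0 * K_B * temperature / M_P) / 1.0e5


def optical_depth(u0, x_mpc, n_hi, v_pec, temp, hubble, z_bar):
    """Optical depth at velocity coordinate u0 [km/s].

    x_mpc  : comoving cell positions [Mpc], uniformly spaced (assumption)
    n_hi   : proper neutral hydrogen density per cell [cm^-3]
    v_pec  : peculiar velocity per cell [km/s]
    temp   : gas temperature per cell [K]
    hubble : Hubble parameter at z_bar [km/s/Mpc]
    """
    dx_cm = (x_mpc[1] - x_mpc[0]) * CM_PER_MPC
    tau = 0.0
    for x, n, v, t in zip(x_mpc, n_hi, v_pec, temp):
        u = hubble / (1.0 + z_bar) * x + v          # velocity coordinate, x_bar = 0
        b = doppler_b(t)
        profile = C_KMS / (b * math.sqrt(math.pi)) * math.exp(-((u - u0) / b) ** 2)
        tau += n / (1.0 + z_bar) * SIGMA_ALPHA0 * profile * dx_cm
    return tau


# Uniform toy medium without peculiar velocities on a 10.24 Mpc, 1024-cell line.
n_cells = 1024
x = [(i + 0.5) * 0.01 for i in range(n_cells)]
tau_mid = optical_depth(u0=384.0, x_mpc=x, n_hi=[1.0e-10] * n_cells,
                        v_pec=[0.0] * n_cells, temp=[1.0e4] * n_cells,
                        hubble=300.0, z_bar=3.0)
```

For this uniform toy medium the sum reduces to the product of the neutral density, the cross-section normalization and the Hubble length, which provides a simple check of the units.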

The final step in the computation of an absorption spectrum
consists of deriving the neutral hydrogen fraction $x_{\rm HI}$
from the baryonic overdensity $\delta_b$ estimated with our HPM
code. The neutral fraction is determined by the balance between
photoionization and recombination, and it depends both on the
temperature $T$ and the intensity of the UV background
$J_{\rm HI}$. The temperature typically is a function of the
position and is determined by the effective equation of state,
while the intensity of the UVB is, in our case, a free parameter.
An illustrative example of the result of our procedure is shown
in Fig. 2.

Finally, the absorption spectrum should match two observational
constraints: (i) the evolution of the effective optical depth in
the Lyα forest and (ii) the flux probability distribution
function. To accurately calibrate our simulation, we employed the
sample of 40 high resolution (R ~ 45 000), high S/N (S/N ~ 70)
quasar spectra obtained with the UV-Visual Echelle Spectrograph
(UVES), probing a redshift interval between z ~ 1.8 and z ~ 4.6
(Paper II). Our simulated spectra are computed with the same
spectral resolution as the observed sample, and in a similar
redshift range, thus the two data sets can be directly compared.

Figure 3. Effective optical depth evolution in our simulated sight lines in
comparison with observations. The solid circles are the average values of
$\tau_{\rm eff}$ in our synthetic spectra with the relative dispersions. Triangles and
squares represent the measurements performed by Schaye et al. (2003) and
Dall'Aglio et al. (2008a) respectively, employing different samples of high
resolution quasar spectra. The solid line represents the best-fit solution of
Eq. 7 recently estimated by Dall'Aglio et al. (2008a).

The calibration of the simulated absorption spectra has been
carried out iteratively. As the main goal of this work is to test
and compare different methods of estimating the proximity effect
signature, we do not attempt to match exactly the synthetic and
observed spectra. Rather, we adjust the intensity of the UV
background to obtain an acceptable (but not necessarily the best)
match between the simulated sight lines and the observed flux
probability distribution and the evolution of the effective
optical depth from the UVES observations. The final values of the
UV background are listed in Tab. 2.

A complete study of how well the synthetic spectra can 
match the observed data would require a much more careful 
comparison between the model and the data, including model- 
ing the observational procedure of determining the continuum 
level, thorough sampling of possible temperature-density re- 
lations in the modeled forest, etc. While such effort is well 
worth performing, it is beyond the scope of this paper and we 
postpone it to a future work. 

3.2. The evolution of the effective optical depth 

One fundamental observed property of the Lyα forest is a steep
decline in the hydrogen opacity towards low redshift. This
behavior is reflected in the so-called effective optical depth,
which is defined as
$\tau_{\rm eff} = -\ln\langle F\rangle = -\ln\langle e^{-\tau}\rangle$,
where $F$ is the transmitted flux and the averaging
$\langle\,\rangle$ is performed over a fixed redshift path
length. The redshift evolution of $\tau_{\rm eff}$ is well
approximated by a power law in the form

$$ \tau_{\rm eff} = \tau_0 (1+z)^{\gamma+1} \tag{7} $$

(Kim et al. 2002; Faucher-Giguère et al. 2008), where the slope
$\gamma$ has no direct connection to the slope $\gamma$ of the
equation of state (see Eq. 1).
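The definition of the effective optical depth translates directly into code. The following Python sketch (function names are hypothetical; the power law parameters are left as free arguments because their fitted values are not quoted here) also illustrates that the effective optical depth is in general smaller than the mean optical depth, since the averaging is performed on the transmitted flux.

```python
import math


def effective_optical_depth(flux):
    """tau_eff = -ln<F>, with the average taken over all pixels in `flux`."""
    mean_flux = sum(flux) / len(flux)
    return -math.log(mean_flux)


def tau_eff_power_law(z, tau0, gamma):
    """Power law evolution of the effective optical depth (Eq. 7)."""
    return tau0 * (1.0 + z) ** (gamma + 1.0)


# Two pixels with very different optical depths.
taus = [0.1, 2.0]
tau_eff = effective_optical_depth([math.exp(-t) for t in taus])
tau_mean = sum(taus) / len(taus)   # 1.05, larger than tau_eff
```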



Table 2

The UV background intensity and the effective optical depth in the
simulations.

z       $J_{\rm HI}/10^{-21}$ †    $\bar{\tau}_{\rm eff}$
4.00    0.25                       0.75 ± 0.09
3.50    0.30                       0.52 ± 0.08
3.25    0.30                       0.39 ± 0.07
3.00    0.35                       0.34 ± 0.06
2.76    0.40                       0.29 ± 0.06
2.50    0.40                       0.25 ± 0.05
2.25    0.40                       0.21 ± 0.05

†: in units of erg cm$^{-2}$ s$^{-1}$ Hz$^{-1}$ sr$^{-1}$

The main difference between observed spectra of the Lyα forest
and our synthetic realizations is the lack of any evolution of
$\tau_{\rm eff}$ with redshift along individual sight lines in
the latter. This is simply because our simulated spectra are
drawn through a single cosmological box at one particular
redshift. Thus, for each snapshot, we can estimate the mean
$\tau_{\rm eff}$ and its dispersion starting from a measure of
the average transmitted flux along each of the 500 simulated
lines of sight and normalizing to the whole redshift interval
probed by the observed spectra.

Figure 3 puts our results from the simulated sight lines into
context, while Tab. 2 lists the numerical values. For all
snapshots at our disposal, the inferred average effective optical
depth closely follows the expected enhancement at high redshift
as probed by different investigations on high resolution quasar
spectra (Schaye et al. 2003, Paper II). Note that the
uncertainties on the effective optical depths represent the RMS
of $\tau_{\rm eff}$ determined on each single line of sight and
not the real uncertainties of the measurements.

3.3. The flux probability distribution 

The steep evolution of the hydrogen opacity in the Lyα forest
described in the previous section can be detected in quasar
spectra not only by measuring the average transmitted flux, but
also by analyzing how the shape of the flux probability
distribution (FPD) changes with redshift (Jenkins & Ostriker
1991). The FPD provides a strong observational constraint, which
a realistic model of the Lyα forest must reproduce
satisfactorily.

We employ similar approaches to compute the FPD in the simulated
and in the observed spectra. Both (the synthetic and observed)
distributions are sampled in bins of $\Delta F = 0.01$ and
normalized by the bin size to maintain the condition that the FPD
integrates to 1.

The observed FPD is estimated from the Lyα forest of those
quasars intersecting a redshift slice of $\Delta z = 0.2$,
centered at the redshift of the simulated snapshot (z in Tab. 1).
The FPD for the synthetic spectra is measured by combining the
signal for all 500 lines of sight at one particular z.
Additionally, we add Gaussian noise to the simulated line of
sight in order to reproduce the average S/N level of the observed
spectra.
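A minimal Python sketch of this measurement is given below. It is only illustrative: the noise is assumed to have a constant standard deviation of 1/(S/N) relative to the continuum, pixels scattered outside the range 0 to 1 are assigned to the edge bins, and the toy flux values stand in for a real simulated sight line. All function names are hypothetical.

```python
import math
import random


def add_noise(flux, snr, rng):
    """Add Gaussian noise with sigma = 1/snr (continuum-normalized flux)."""
    sigma = 1.0 / snr
    return [f + rng.gauss(0.0, sigma) for f in flux]


def flux_pdf(flux, bin_width=0.01, f_min=0.0, f_max=1.0):
    """Flux probability distribution, normalized so that it integrates to 1."""
    n_bins = int(round((f_max - f_min) / bin_width))
    counts = [0] * n_bins
    for f in flux:
        k = int(math.floor((f - f_min) / bin_width))
        k = min(max(k, 0), n_bins - 1)      # out-of-range pixels go to edge bins
        counts[k] += 1
    norm = len(flux) * bin_width            # normalize by sample and bin size
    return [c / norm for c in counts]


rng = random.Random(42)
# Toy transmitted flux: exp(-tau) with exponentially distributed tau.
toy_flux = [math.exp(-rng.expovariate(3.0)) for _ in range(5000)]
pdf = flux_pdf(add_noise(toy_flux, snr=70.0, rng=rng))
integral = sum(p * 0.01 for p in pdf)       # equals 1 up to rounding
```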

Figure 4 presents the comparison between the two estimates of the
flux probability distribution. The agreement between the two
distributions is reasonably good even if there are some
indications of a departure at high redshift, in particular for
the flux around unity. This lack of agreement is explained by the
differences in the continuum placement of the observed and
synthetic spectra. Additionally, we note that the error bars of
the observed FPD are an estimate of both continuum uncertainties
and Poissonian variance between different lines of sight. For the
continuum uncertainties we adopted the estimates presented in
Paper II.

Figure 4. The average flux probability distribution (FPD) estimated from 500 simulated sight lines at three different redshifts (z = 2.25, 3.0 and 4.0, gray
histogram), in comparison with the observed FPD inferred from a sample of 40 high resolution UVES/VLT quasar spectra (vertical bars). The uncertainties in the
simulated FPD are negligible, while the error bars in the observed FPD account for the variance of absorption between different lines of sight and uncertainties
in the continuum determination.

Figure 5. The differential distribution function of the H I column densities
estimated from a sample of 300 simulated sight lines at redshift z = 2.25, 3.0
and 4.0 (100 sight lines per redshift). The solid line represents the least square
power law fit to the data points. The measured slope of the column density
distribution ($\beta = 1.64$) is consistent with observations by Kim et al. (2002).

3.4. The column density distribution 

The high resolution of our simulations allows us to further
characterize the statistical properties of synthetic spectra by
measuring the distribution of column densities. The differential
distribution function of the hydrogen column densities
$f(N_{\rm HI})$ is typically defined as the number $n$ of
absorption lines per unit column density and per unit absorption
path length $\Delta X$ (see footnote 5; Tytler 1987). This
distribution is typically very well represented by a single power
law of the form $f(N_{\rm HI}) \propto N_{\rm HI}^{-\beta}$ with
$\beta$ ranging between 1.4 and 1.7 (Hu et al. 1995; Kim et al.
2002).
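As an illustration, the absorption path length entering this definition can be evaluated with a short Python function. The sketch assumes the standard expression for a flat universe, $\Delta X = (1+z)\,\Delta z\,[\Omega_m(1+z) + \Omega_\Lambda(1+z)^{-2}]^{-1/2}$, and takes the density parameters of Tab. 1 as defaults; the function name is hypothetical.

```python
import math


def absorption_path(z, dz, omega_m=0.237, omega_l=0.763):
    """Absorption path length Delta X for a redshift interval dz centered at z."""
    zp1 = 1.0 + z
    return zp1 * dz / math.sqrt(omega_m * zp1 + omega_l / zp1 ** 2)


# Path length of a slice of width 0.2 in redshift at z = 3.
dx_z3 = absorption_path(3.0, 0.2)
```

The expression is algebraically identical to the more common form $(1+z)^2\,\Delta z\,[\Omega_m(1+z)^3 + \Omega_\Lambda]^{-1/2}$, which offers a quick consistency check.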

Performing a fit of an absorption spectrum is computationally
expensive, therefore we proceeded as follows: (i) we randomly
selected 100 simulated sight lines from our full sample of 500,
(ii) for each selected line of sight we performed a Doppler
profile fit using the publicly available code AUTOVP (see
footnote 6), and then (iii) we visually inspected all the lines
of sight in order to reject the few cases where the automatic
fitting procedure fails. We repeated this procedure for sight
lines drawn from the snapshots at z = 4.0, 3.0 and 2.25 and then
combined the results.

5 $\Delta X = (1+z)\,\Delta z\,[\Omega_m (1+z) + \Omega_\Lambda (1+z)^{-2}]^{-1/2}$ (Misawa et al. 2002)
6 Developed by R. Dave: http://ursa.as.arizona.edu/~rad

The estimated column density distribution is plotted in Fig. 5.
Within the range $12 < \log N_{\rm HI} \lesssim 16$ (with
$N_{\rm HI}$ in cm$^{-2}$), the distribution accurately follows a
power law with a slope of $\beta = 1.64$, close to several
observational results (Tytler 1987; Hu et al. 1995; Kim et al.
2002). Our data points seem to deviate from a power law
extrapolation at the low column density end. This effect,
discussed in detail by Hu et al. (1995), is the result of
incompleteness in the sample of lines arising primarily from line
blending and further amplified by noise.
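A least squares power law fit of this kind amounts to a linear regression in logarithmic space. The Python sketch below is a generic illustration, not the fitting code used for Fig. 5; the function name and the synthetic data points are assumptions introduced only to show the procedure.

```python
def fit_power_law(log_n, log_f):
    """Least squares fit of log f = intercept - beta * log N.

    Returns (beta, intercept), with beta the power law slope.
    """
    n = len(log_n)
    mean_x = sum(log_n) / n
    mean_y = sum(log_f) / n
    s_xx = sum((x - mean_x) ** 2 for x in log_n)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_f))
    slope = s_xy / s_xx
    return -slope, mean_y - slope * mean_x


# Synthetic, noise-free points following a power law with slope 1.64.
log_n = [12.5 + 0.5 * i for i in range(8)]      # log N from 12.5 to 16.0
log_f = [9.0 - 1.64 * x for x in log_n]
beta, intercept = fit_power_law(log_n, log_f)
```

In practice the fit would be restricted to the column density range where the line sample is complete, since the incompleteness at the low column density end flattens the measured distribution.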

4. THE PROXIMITY EFFECT 

The prime goal of this work is to test different techniques for
detecting the proximity effect in quasar spectra. We now have a
set of simulated sight lines at our disposal that accurately
reproduce many statistical properties of the observed Lyα forest.
We now discuss how we introduced the proximity effect in the
simulated spectra.

In the vicinity of a luminous quasar, the intensity of UV
radiation produced by the quasar itself is typically up to
several orders of magnitude larger than the intensity of the UV
background. This enhanced ionizing radiation acts on the neutral
hydrogen which, after a period of only about $10^4$ yr after the
quasar turn-on event, reaches a new state of photoionization
equilibrium. In this regime, the neutral hydrogen density of the
IGM in the absence of the quasar ionizing radiation,
$n_{{\rm HI},\infty}$, relates to that with the quasar radiation,
$n_{\rm HI}$, as

$$ n_{\rm HI} = \frac{n_{{\rm HI},\infty}}{1+\omega} \tag{8} $$



where $\omega$ describes the excess of ionizing radiation in the
vicinity of the quasar in units of the average cosmic UV
background (Bajtlik et al. 1988). Analytically, $\omega$ can be
expressed in units of the UVB photoionization rate $\Gamma_b$ or
in units of its intensity at the Lyman limit $J_{\nu_0}$,

$$ \omega(z) = \frac{\Gamma_q}{\Gamma_b} = \frac{f_{\nu_0}}{4\pi J_{\nu_0}} \frac{1}{1+z} \left[\frac{d_L(z_q,0)}{d_L(z_q,z)}\right]^2 \tag{9} $$

where $z$ is the redshift along the line of sight ($z < z_q$),
$d_L(z_q,0)$ is the luminosity distance of the quasar to the
observer, and $d_L(z_q,z)$ is the luminosity distance to redshift
$z$ along the line of sight. The parameters $\Gamma_q$ and
$f_{\nu_0}$ quantify the photoionization rate and the Lyman limit
flux of the quasar, respectively. If we isolate the redshift
dependence in Eq. 9 from the constants, we can define a new,
unit-less parameter $\omega_\star$, which is independent of the
quasar redshift and is given by

$$ \omega_\star = \frac{L_{\nu_0}}{(4\pi R_0)^2 J_{\nu_0}} \tag{10} $$

where $L_{\nu_0}$ is the Lyman limit luminosity of the quasar and
$R_0 = 10$ Mpc is an arbitrary distance scale introduced to make
$\omega_\star$ unit-less (it also appears in Eq. 11). For quasars
with typical Lyman limit luminosities in the range
$30.5 < \log(L_{\nu_0}) < 32.5$ and a constant UVB intensity
$J_{\nu_0} = 10^{-21.51}$ in units of
erg cm$^{-2}$ s$^{-1}$ Hz$^{-1}$ sr$^{-1}$ (Paper II), we obtain
$0.07 < \omega_\star < 1.5$. With our new definitions, Eq. 9
becomes

$$ \omega(z) = \omega_\star \frac{1+z_q}{1+z} \left[\frac{R_0}{d_L(z_q,z)}\right]^2 \tag{11} $$

Figure 6. Example of one simulated line of sight with the signature of the
proximity effect introduced on the density of neutral hydrogen (dashed line)
and on the optical depth (solid line). The differences between the two
absorption patterns are due to the peculiar velocities along the sight line, which are
neglected whenever the proximity effect is introduced on the optical depth.
The middle panel shows the ratio between the two spectra. While the
influence of peculiar velocities leads to a minor discrepancy between the two
spectra, this difference is drowned in the noise even in high S/N spectra, as is
shown in the bottom panel (S/N=100).
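To make the construction of the $\omega$ scale concrete, the Python sketch below evaluates Eq. 11 in the form $\omega(z) = \omega_\star\,(1+z_q)/(1+z)\,[R_0/d_L(z_q,z)]^2$ for a flat cosmology with the parameters of Tab. 1. Two points are assumptions of this sketch and are not spelled out in the text: the luminosity distance of the quasar as seen from redshift $z$ is computed as $d_L(z_q,z) = D_C(z,z_q)\,(1+z_q)/(1+z)^2$, with $D_C$ the comoving distance between the two redshifts, and the integral is evaluated with Simpson's rule. All function names are hypothetical.

```python
import math

C_KMS = 2.998e5  # speed of light [km/s]


def comoving_distance(z1, z2, h=0.735, omega_m=0.237, omega_l=0.763, n=200):
    """Comoving distance [Mpc] between z1 and z2 in a flat universe.

    Composite Simpson rule with n (even) sub-intervals.
    """
    def inv_e(z):
        return 1.0 / math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)

    step = (z2 - z1) / n
    total = inv_e(z1) + inv_e(z2)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * inv_e(z1 + i * step)
    return C_KMS / (100.0 * h) * total * step / 3.0


def lum_dist_from_quasar(z_q, z):
    """Luminosity distance [Mpc] of a quasar at z_q seen from redshift z (assumed form)."""
    return comoving_distance(z, z_q) * (1.0 + z_q) / (1.0 + z) ** 2


def omega_scale(z, z_q, omega_star=1.0, r0=10.0):
    """Excess of ionizing radiation omega(z) for z < z_q; r0 in Mpc."""
    if not z < z_q:
        raise ValueError("omega(z) is defined for z < z_q")
    d_l = lum_dist_from_quasar(z_q, z)
    return omega_star * (1.0 + z_q) / (1.0 + z) * (r0 / d_l) ** 2


# omega drops steeply with increasing distance from a quasar at z_q = 3.
w_near = omega_scale(2.99, 3.0)
w_far = omega_scale(2.50, 3.0)
```

With this choice of distance, $\omega$ close to the quasar reduces to $\omega_\star (R_0/r)^2$ with $r$ the proper distance, as expected for a purely geometric dilution of the quasar flux.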



Neither the neutral hydrogen density nor the velocity field of
the gas along the line of sight can be derived from an observed
quasar spectrum. Therefore, the main strategy to recover the
influence of the quasar ionization field on the Lyα forest is to
translate the implications of Eq. 8 into observables such as the
transmitted flux or the effective optical depth along the line of
sight. Assuming that the optical depth follows the same type of
relation as the neutral hydrogen density in Eq. 8, Liske &
Williger (2001) included the quasar proximity effect into
$\tau_{\rm eff}$,

$$ \tau_{\rm eff} = \tau_0 (1+z)^{\gamma+1} (1+\omega)^{1-\beta}. \tag{12} $$

In the case of our simulated lines of sight, the term expressing
the evolution of the effective optical depth in the Lyα forest,
$\tau_0 (1+z)^{\gamma+1}$, will be substituted by the average
$\langle\tau_{\rm eff}(z)\rangle$ at each snapshot redshift as
listed in Tab. 2. In the rest of this paper, it will be
convenient to use a variable $\xi$ defined as

$$ \xi = \frac{\tau_{\rm eff}}{\langle\tau_{\rm eff}(z)\rangle} = (1+\omega)^{1-\beta} \tag{13} $$

where $\beta$ is the slope of the column density distribution.

We note that assuming the validity of Eq. 8 also for the optical
depth along a line of sight implies that the peculiar velocity of
the hydrogen in the IGM has a negligible impact on the absorption
spectrum. This assumption is impossible to test observationally,
but it can be justified with the simulated spectra. We have the
unique possibility of estimating this effect for the first time.
We thus proceed as follows: (i) we compute a set of 100 sight
lines at three different redshifts ($z_q$ = 2.25, 3 and 4) as
described in Sect. 3.1 and include the proximity effect as a
modification of the optical depth along the line of sight, or,
alternatively, (ii) we insert Eq. 8 into Eq. 5, meaning that we
include the proximity effect on the hydrogen density, and then
compute the same lines of sight as in (i). Figure 6 presents the
result of such a computation of the proximity effect on both the
optical depth and the neutral hydrogen density. Peculiar
velocities lead to a discrepancy between the two proximity effect
profiles; however, this difference cannot be detected, since it
is dominated by noise even in high S/N quasar spectra
(S/N ~ 100).

In the following, the proximity effect is included in the
simulated spectra as a modification of the neutral hydrogen
density according to Equations 8 and 11. We note that the origin
of all the lines of sight is random, thus the location of the
quasar (but not its emission redshift) is also random. We
therefore neglect in the present analysis any effect of a biased
quasar environment, i.e. overdensities. While our results do
change quantitatively if we vary $\omega_\star$, the qualitative
outcome of our analysis is independent of a particular choice of
its numerical value. Therefore, by default we adopt
$\omega_\star = \omega_\star^{\rm in} = 1$, unless stated
otherwise.

5. METHODS OF ESTIMATING THE PROXIMITY 
EFFECT STRENGTH 

5.1. Reference approach: the combined proximity effect 

Among all the investigations of quasar spectra aimed at detecting
the proximity effect, two techniques have been employed so far:
(i) the line counting statistic and (ii) the flux transmission
statistic. Both adopt the common principle of estimating a
certain quantity (number of lines or average transmission) within
a regularly spaced grid in a sample of quasars. As we already
showed in Paper II the advantages of the flux transmission with
respect to the line counting statistic, we will use only the flux
transmission statistic as our reference technique.

For each of the simulated spectra, and given the "input"
value $\omega_\star^{\rm in}$, we construct the $\omega$ scale according to
Eq. [11] and then define a uniform grid in $\log\omega$ space. In each of
the grid elements we determine the average flux and, thus, the
effective optical depth, considering all spectra simultaneously.
Finally, following Eq. [13] we derive the corresponding
values of $\xi$ as a function of $\omega$. The typical proximity effect
signature is such that $\xi \to 0$ for $\omega \to \infty$, and it can be
analytically modeled according to the formula



$$F(\omega) = \left(1 + a\,\omega\right)^{1-\beta} \tag{14}$$



where the slope of the column density distribution was fixed
to $\beta = 1.64$ at all redshifts according to our measurements
(Sect. [3.4]), and $a$ is a single fitting parameter. The best-fit
value of $a$ can then be used to compute the "measured" value
of the proximity effect strength, $\omega_\star^{\rm out} = a\,\omega_\star^{\rm in}$.
Ideally, this value should be close to the input value
$\omega_\star^{\rm in}$ (i.e. $a$ should be close to 1).
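
The fit for the strength parameter can be sketched as follows. This is
only an illustration under stated assumptions: the model is taken to be
F(omega) = (1 + a omega)^(1 - beta) with beta = 1.64, so that a = 1
recovers the input model, and the least-squares grid search over log a
is a stand-in for whatever fitting routine is actually used. All names
and the toy data are hypothetical.

```python
import math

BETA = 1.64  # slope of the column density distribution quoted in the text

def xi_model(omega, a, beta=BETA):
    """Assumed one-parameter model for the normalized effective optical depth."""
    return (1.0 + a * omega) ** (1.0 - beta)

def fit_log_a(omegas, xis, lo=-2.0, hi=2.0, steps=4001):
    """Least-squares grid search for log10(a); an illustrative choice of fitter."""
    best_log_a, best_chi2 = None, float("inf")
    for k in range(steps):
        log_a = lo + (hi - lo) * k / (steps - 1)
        a = 10.0 ** log_a
        chi2 = sum((x - xi_model(w, a)) ** 2 for w, x in zip(omegas, xis))
        if chi2 < best_chi2:
            best_log_a, best_chi2 = log_a, chi2
    return best_log_a

# Toy data on a uniform grid in log(omega), generated with a = 1.5.
omegas = [10.0 ** (-2.0 + 0.5 * k) for k in range(9)]
xis = [xi_model(w, 1.5) for w in omegas]
log_a_fit = fit_log_a(omegas, xis)
```

With noise-free toy data the search returns the grid value closest to
log10(1.5); real measurements of xi would scatter around the model and
the recovered log a would scatter accordingly.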

This technique has been employed in the majority of the
proximity effect investigations aiming at a constraint of the
cosmic UV background intensity at the Lyman limit, since
$\omega \propto J^{-1}$. In Paper II we first showed that this
combined method is characterized by an intrinsic bias. Employing
Monte Carlo simulations, we presented evidence for this bias
by comparing the input and output proximity effect signal in
a set of 500 synthetic spectra.

While a Monte Carlo approach may be sufficient for efficiently
simulating the "randomness" in the properties of the
absorbers, the new sight lines presented here are a significant
step forward in terms of accurately reproducing the statistical
properties of the Ly$\alpha$ forest. We begin our investigation by
comparing the results of the combined analysis of the proximity
effect for both the Monte Carlo and the numerically simulated
lines of sight. The Monte Carlo simulated spectra have been
computed using the same procedure as in Paper II. In all cases
we employed the signal of 500 spectra, including the proximity
effect in the same way as described in Sect. [4].

We fitted Eq. [14] to the values of $\xi$ determined from
a combination of all sight lines. Repeating this exercise
at z = (2.25, 3.0, 4.0) we obtained for the Monte
Carlo simulations an overestimation in $\omega_\star^{\rm out}$ equal to
$\Delta\log a$ = (0.14, 0.1, 0.05) dex, respectively, while for the
HPM simulations we obtained $\Delta\log a$ = (0.1, 0.01, 0.01) dex.
This, on the one hand, confirms the existence of the bias, but
on the other hand shows that the Monte Carlo simulations tend
to overestimate it. In particular, at z = 3.0 the HPM simulated
sight lines predict an almost negligible overestimation.
We suspect that the origin of this marginal disagreement may
be primarily attributed to the procedure that generates Monte
Carlo absorption spectra. The algorithm does not place a fixed
number of absorption lines, but instead continues to populate the
spectrum with as many lines as necessary to yield an evolution
of $\tau_{\rm eff}$ consistent with a pre-fixed power law. This may then
translate into a larger scatter of absorption very close to the
emission redshift, thus enhancing the systematic bias when
combining multiple sight lines. We also cannot rule out the
possibility that the calibration of our new synthetic spectra
against observations has an effect in reducing the bias of the
combined analysis of the proximity effect.

5.2. The proximity effect strength distribution 

A correct understanding of the biases involved in the combined
analysis of the proximity effect is essential to accurately
determine the cosmic UV background intensity. In Paper II we
proposed a new technique of measuring the UVB intensity that is
unaffected by the biases described in the previous section.
This approach is based on the determination of the proximity
effect along individual lines of sight in a quasar sample.
Adopting Monte Carlo simulated lines of sight at different
redshifts, we fitted Eq. [14] to individual spectra and
showed that

1. the distribution of $\log(\omega_\star^{\rm out}/\omega_\star^{\rm in}) = \log a$ is skewed

2. the skewness increases with decreasing redshift

3. this asymmetry is the main contributor to the overestimation
of the UVB found in the literature

4. the peak of this distribution is an unbiased estimate of
the UV background intensity

Figure 7. The proximity effect signature in one simulated line of sight. The
data points show the normalized effective optical depth $\xi$ versus $\omega$, binned in
steps of $\Delta\log\omega = 1$. The dotted line represents the reference model used to
introduce the proximity effect in the synthetic spectrum. The solid line shows
the best-fit model as described in Sect. [5.1].

The skewness of the proximity effect strength distribution
(PESD) originates from the definition of the uniform grid
in $\log\omega$ space. In other words, as a constant $\log\omega$ range
progressively probes smaller redshift intervals approaching
the quasar, the absorbers tend to no longer be Gaussian
distributed. This effect is further enhanced at lower emission
redshifts since the line number density decreases. Therefore, the
distribution not only becomes broader, but also more skewed.

To check how accurately we can recover the input value
$\omega_\star^{\rm in}$, we fit Eq. [14] to all 500 lines of sight at three
different redshifts ($z_q$ = 2.25, 3 and 4). That gives us an estimate
of the proximity effect strength $\omega_\star^{\rm out}$ along each sight line.
Figure [7] illustrates a typical example of the proximity effect
signature along one sight line in our HPM simulations. All lines
of sight can then be combined to form the proximity effect strength
distribution. Figure [8] presents our results.

We confirm, with advanced 3D numerical simulations, the
recent results reported in Paper II: the PESD sharply peaks
at the input model ($\log\omega_\star^{\rm out}/\omega_\star^{\rm in} = 0$) and becomes broader
towards lower redshift. Furthermore, the skewness in the
PESD increases towards low redshift. However, our results
on the PESD inferred from the HPM simulation differ
quantitatively from the Monte Carlo simulations. While
the peaks of the distributions match, the rms dispersions are
significantly smaller for the HPM-based sight lines. We obtained
at redshifts z = (2.25, 3.0, 4.0) a dispersion of the strength
parameter equal to $\sigma_{\log a}$ = (0.3, 0.23, 0.1) dex for the HPM
simulations, while for the Monte Carlo ones we estimated
$\sigma_{\log a}$ = (0.65, 0.5, 0.2) dex. The larger dispersion in the
latter results in the stronger bias in the combined proximity
effect analysis reported in the previous section.

To precisely estimate the uncertainties related to the modal
value of the PESD we adopted a bootstrap technique. Starting
from a distribution of $N_l$ values of $\log a$, where $N_l$ represents
the total number of $\log a$ estimates, we randomly duplicated
$N_l/e$ strength parameters and estimated the modal value of
the new PESD. We repeated this process 500 times for each
redshift snapshot (as well as in the following), obtaining the
mean and the standard deviation of the PESD modes.
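
A bootstrap estimate of the mode and its uncertainty can be sketched as
below. Note the substitutions: the sketch uses ordinary resampling with
replacement rather than the duplication of a fixed fraction of values
described above, and it takes the mode as the centre of the most
populated histogram bin with a hypothetical bin width of 0.1 dex. The
skewed toy sample only stands in for a real set of log a values.

```python
import math
import random
import statistics

def histogram_mode(values, bin_width=0.1):
    """Centre of the most populated bin (lowest bin wins ties)."""
    counts = {}
    for v in values:
        b = math.floor(v / bin_width)
        counts[b] = counts.get(b, 0) + 1
    peak_bin = max(sorted(counts), key=counts.get)
    return (peak_bin + 0.5) * bin_width

def bootstrap_mode(values, n_boot=500, bin_width=0.1, seed=0):
    """Mean and standard deviation of the mode over n_boot resamples."""
    rng = random.Random(seed)
    modes = []
    for _ in range(n_boot):
        resample = rng.choices(values, k=len(values))  # with replacement
        modes.append(histogram_mode(resample, bin_width))
    return statistics.mean(modes), statistics.stdev(modes)

# Skewed toy sample of 500 log(a) values (illustrative, not simulation output).
toy_rng = random.Random(1)
log_a = [toy_rng.gauss(0.0, 0.1) + toy_rng.expovariate(5.0) for _ in range(500)]
mode_mean, mode_sigma = bootstrap_mode(log_a)
```

The returned pair plays the role of the modal value and its error bar;
the bin width sets the resolution with which the mode can be located.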






Dall'Aglio et al. 




Figure 8. The proximity effect strength distribution (PESD), as a function of $\log(\omega_\star^{\rm out}/\omega_\star^{\rm in})$, in three different sets of 500 sight lines drawn from our HPM simulation boxes at redshift z = 2.25, 3.0
and 4.0 (thick histogram). The thin histogram represents the PESD obtained from a sample of 500 Monte Carlo simulated lines of sight. For both types of
simulations we determined the proximity effect strength adopting the best-fit $\log a$ value of Eq. [14]. The vertical dashed line marks the reference model used for
creating synthetic Ly$\alpha$ forest spectra.

Figure 9. An example of the likelihood function $\mathcal{L}$ estimated from two different
sight lines in our simulations. While in one sight line (dotted dashed
profile) the likelihood is maximized at the input model, the second sight line
(dotted profile) has the most likely value of $\omega_\star^{\rm out}$ significantly below $\omega_\star^{\rm in}$.



Measuring the proximity effect signal along individual lines
of sight, and thus determining the PESD, allows unbiased
estimates of the cosmic UV background. However, this method
is still based on a simple averaging process of the absorption
in the Ly$\alpha$ forest. In other words, the advantage of dealing
with very high quality data is not fully exploited. Hereafter,
we will refer to the PESD estimated from the normalized optical
depth on individual lines of sight as the simple averaging
technique.

5.3. The maximum-likelihood approach 

The importance of a precise determination of the UVB intensity
at different epochs motivates us to further develop and
test new methods of determining the proximity effect strength.

A widely used, extremely flexible approach for recovering
input parameters is the maximization of the likelihood function
(LF). It expresses how probable the observed data are for a
given set of parameters of a statistical model. In our case, we
can write this function as the probability that our spectrum has
been modified by the quasar radiation of a given strength $\omega_\star$.

Generally, the likelihood function is defined as

$$\mathcal{L} = \prod_{i=1}^{N} P(F_i\,|\,C) \tag{15}$$

where the product is calculated over the $N$ data points, and
$P(F_i|C)$ is the probability of occurrence of the measurement
$F_i$ given the set of parameters $C$. Here, all data points are
flux values in the observed or synthetic Ly$\alpha$ spectrum.



The prime limitation of Eq. [15] is that the product operator
must be applied to uncorrelated data points. However, neighboring
pixels in the Ly$\alpha$ absorption spectrum are strongly correlated
due to both physical correlations of cosmic large-scale
structures and thermal and instrumental broadening of absorption
lines. Eq. [15] can be generalized for the case of correlated
data, but that would require knowing an N-point correlation
function for the flux, which cannot be estimated in any
reasonable way from either the observational data or
the simulations.

Therefore, as a first attempt, we adopt a simplified approach
and re-bin the spectra over at least 40 km s$^{-1}$ (the average
width of an absorber in velocity space) in order to significantly
reduce the correlation in the Ly$\alpha$ forest without losing
too much resolution. In the following sections we discuss
more sophisticated methods of accounting for the correlations
between the data points.
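
The re-binning step can be sketched as a plain block average. The pixel
size passed in the usage line is a hypothetical value, and dropping an
incomplete trailing bin is a choice made here for simplicity.

```python
def rebin_flux(flux, dv_pixel, dv_bin=40.0):
    """Average consecutive pixels into bins of roughly dv_bin km/s.

    flux     : flux values on a uniform velocity grid
    dv_pixel : pixel size in km/s (assumed constant)
    An incomplete bin at the end of the spectrum is dropped.
    """
    n = max(1, round(dv_bin / dv_pixel))
    return [sum(flux[i:i + n]) / n for i in range(0, len(flux) - n + 1, n)]

# Five pixels of 20 km/s each are averaged in pairs; the last pixel is dropped.
rebinned = rebin_flux([1.0, 0.0, 0.5, 0.5, 0.2], dv_pixel=20.0)
```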

Let us now compare the same sight line with and without
the signature of the proximity effect. These two spectra have
the same original hydrogen distribution along the line of sight,
thus the two hydrogen densities, or, ignoring the peculiar
velocities, the two optical depths, are related by Eq. [8]. Our
aim is to express the observed flux probability density $P(F)$
in Eq. [15] as a function of the strength of the proximity effect
and the flux probability unaffected by the quasar radiation.

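The FPD affected ($P_m$) and unaffected ($P_\infty$, as presented in
Fig. [4]) by the quasar radiation are related by

$$P_m(F_m)\,dF_m = P_\infty(F_\infty)\,dF_\infty. \tag{16}$$

Knowing that

$$F_m = e^{-\tau_m} = \exp\left(-\frac{\tau_\infty}{1+\omega}\right) = F_\infty^{1/(1+\omega)}, \tag{17}$$

we can write

$$P_m(F_m) = P_\infty\left(F_m^{1+\omega}\right)(1+\omega)\,F_m^{\omega}. \tag{18}$$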
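
The change of variables leading to Eq. (18) can be checked numerically.
In this sketch the unaffected FPD is a hypothetical normalized toy
function, 6F(1 - F), chosen only because it integrates to one on [0, 1];
it is not the FPD measured from the simulations. The transformed
distribution should remain normalized, and its mean flux should move
towards higher transmission.

```python
def p_affected(f_m, omega, p_inf):
    """FPD in the presence of the quasar: P_inf(F^(1+w)) (1+w) F^w."""
    return p_inf(f_m ** (1.0 + omega)) * (1.0 + omega) * f_m ** omega

def integrate(func, a=0.0, b=1.0, n=20000):
    """Midpoint rule, adequate for the smooth integrands used here."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def p_inf_toy(f):
    """Hypothetical unaffected FPD, normalized on [0, 1]."""
    return 6.0 * f * (1.0 - f)

omega = 0.7  # illustrative strength, constant over the toy spectrum
norm = integrate(lambda f: p_affected(f, omega, p_inf_toy))
mean_unaffected = integrate(lambda f: f * p_inf_toy(f))
mean_affected = integrate(lambda f: f * p_affected(f, omega, p_inf_toy))
```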



The likelihood function in Eq. [15] can be generalized in the
presence of instrumental noise to

$$\mathcal{L} = \prod_i \int_0^1 \frac{\exp\left[-(F_i - F')^2/2\sigma_i^2\right]}{\sigma_i\sqrt{2\pi}}\,P_m(F')\,dF' \tag{19}$$



where the additional exponential term describes the Gaussian
noise with the proper normalization. Inserting Eq. [18] into
Eq. [19] we obtain

$$\mathcal{L} = \prod_i \int_0^1 \frac{\exp\left[-(F_i - F')^2/2\sigma_i^2\right]}{\sigma_i\sqrt{2\pi}} \times P_\infty\left(F'^{\,1+\omega}\right)(1+\omega)\,F'^{\,\omega}\,dF'. \tag{20}$$

This function has only one free parameter, the strength of the
proximity effect $\omega_\star$. For an infinitely high signal-to-noise
ratio, the Gaussian becomes a delta function and the expression
under the integral reduces to the flux probability distribution
given by Eq. [18].

Figure 10. The proximity effect strength distribution in three different sets of 500 sight lines at redshift z = 2.25, 3.0, and 4.0. The PESD has been constructed
adopting the likelihood technique described in Sect. [5.3] to estimate the strength of the proximity effect. The extent of the biases, represented by the shift of the
mode in the PESD with respect to the dashed vertical line, remains constant with redshift.

Figure 11. The flux auto-correlation function in one Ly$\alpha$ forest spectrum
(solid line) in comparison with two different non-trivial weighting schemes
(long and short dashed lines). The introduction of a weighting scheme in
the computation of Corr($\Delta v$) significantly reduces the auto-correlation of the
transmitted flux.

The methodological approach is then straightforward: (i)
we introduce the proximity effect on the hydrogen neutral
fraction along all the lines of sight at our disposal, (ii) we
compute the likelihood function for a set of values $\omega_\star^{\rm out}$
within the range $-1 < \log\omega_\star^{\rm out} < 1$ and finally (iii) we search
for the particular value of $\omega_\star^{\rm out}$ that maximizes Eq. [20]. For
illustration purposes we present in Fig. [9] the likelihood function
for two different lines of sight where the two maxima are
located in different positions with respect to the input model.
After repeating this procedure over all spectra we construct
the PESDs presented in Fig. [10].
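
Steps (i) to (iii) can be sketched on toy data as follows. Several
simplifications are assumed and none of them is taken from the analysis
itself: the noise-free limit of the likelihood is used (the FPD of
Eq. (18) evaluated at each pixel), the unaffected FPD is a hypothetical
toy function 6F(1 - F), the per-pixel strength is written as the overall
strength times a known geometric factor, and the pixels are statistically
independent by construction, which is precisely what does not hold in a
real forest spectrum.

```python
import math
import random

def p_inf_toy(f):
    """Hypothetical unaffected FPD, normalized on [0, 1]."""
    return 6.0 * f * (1.0 - f)

def log_likelihood(omega_star, fluxes, geometry, p_inf=p_inf_toy):
    """Noise-free log-likelihood: sum of log P_m(F_i) over all pixels."""
    total = 0.0
    for f, g in zip(fluxes, geometry):
        w = omega_star * g  # strength in this pixel (assumed scaling)
        p = p_inf(f ** (1.0 + w)) * (1.0 + w) * f ** w
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

def maximize(fluxes, geometry, lo=-1.0, hi=1.0, steps=201):
    """Grid search over log10(omega_star) within [lo, hi]."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda lw: log_likelihood(10.0 ** lw, fluxes, geometry))

# (i) build a toy sight line with the proximity effect included
rng = random.Random(42)
n_pix = 2000
geometry = [10.0 ** rng.uniform(-2.0, 1.0) for _ in range(n_pix)]
omega_in = 1.0
fluxes = [rng.betavariate(2.0, 2.0) ** (1.0 / (1.0 + omega_in * g))
          for g in geometry]
# (ii) and (iii): evaluate the likelihood on the grid and locate its maximum
log_omega_out = maximize(fluxes, geometry)
```

Because the toy pixels are independent, the maximum falls close to the
input value; the sketch therefore illustrates the procedure, not the
bias discussed in the text.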

At all redshifts the inferred PESD has a clear maximum.
However, this maximum does not coincide with the input
model: the modal values of all PESDs are biased
towards smaller $\omega_\star^{\rm out}$. Contrary to the outcome of the simple
averaging technique, this approach fails to recover the input
model, and the inferred PESD is also clearly broader. Several
factors may cause this bias.



(i) Our spectra might still be significantly affected by intrinsic
(as opposed to thermal or instrumental broadening) correlations
in the Ly$\alpha$ forest. Even when we re-bin the spectrum
to significantly reduce the correlations between nearby pixels,
the intrinsic correlations between close absorbers largely
remain. For this reason we recomputed the PESD after re-binning
the spectrum over several tens of km s$^{-1}$, up to 100
km s$^{-1}$. Unfortunately this had no effect on either the modal
value of the PESD or its shape, demonstrating that it is intrinsic
correlations between absorbers and not thermal broadening
that is responsible for the biased result of Fig. [10].

(ii) The flux probability distribution might change significantly
along different lines of sight, thus concealing some
uncontrolled systematic effect when assuming as common FPD
the average over all sight lines. Therefore we have repeated
our computation adopting the FPD estimated from the same
line of sight without the influence of the quasar. Such a procedure
is, of course, not feasible for real observations, but,
nevertheless, it does not solve the problem of a biased PESD. We
have finally tried to analytically fit the average or single FPD
with different fits (Miralda-Escudé et al. 2000; Becker et al.
2007), also without success.

We conclude that the bias in the maximum
likelihood analysis is caused by an intrinsic correlation in the
Ly$\alpha$ forest not being accurately accounted for by our simple
re-binning procedure. In the following, we attempt to solve
this problem by estimating the correlation function in our
simulated spectra.

5.4. The correlation function 

We showed in the previous section that clustering of Ly$\alpha$
absorbers gives rise to correlations in the observed transmitted
flux large enough to heavily bias the results of a maximum
likelihood analysis, even after re-binning the simulated spectrum.
We now focus on measuring how large these correlations
are by means of the correlation function.

Given a point in redshift with transmitted flux $F$, the correlation
function describes the probability of finding another
point, with the same $F$, within a given redshift interval. More
precisely, if we express the interval with a velocity shift $\Delta v$
we can write that

$$\mathrm{Corr}(\Delta v) = \frac{\langle \delta F(v)\,\delta F(v+\Delta v)\rangle}{\bar{F}^2} \tag{21}$$

where $\bar{F}$ represents the mean flux and $\delta F(v) = F(v) - \bar{F}$. The
numerical value of Corr($\Delta v$) is obtained by directly averaging
individual pixels over the spectrum, separated by a given
velocity $\Delta v$.

Figure 12. The proximity effect strength distribution in three different sets of 500 sight lines at redshift z = 2.25, 3.0, and 4.0. The PESD has been constructed
with a modified likelihood approach, which used a weighted FPD to account for the intrinsic correlations in the Ly$\alpha$ forest. The PESD remains biased with
respect to the reference model and the amount of bias is now strongly redshift dependent.



To illustrate the properties of the correlation function, we
computed Corr($\Delta v$) for one sight line and show the result in
Fig. [11]. The amplitude of the correlation increases significantly
for separations smaller than about 100 km s$^{-1}$,
while it fluctuates around zero for separations larger than a
few hundred km s$^{-1}$. This directly shows that individual
pixels in the Ly$\alpha$ forest are not independent of one another, but
strongly correlated. Such a correlation, which also changes
from one sight line to another, does not vanish after a simple
re-binning of the spectrum.
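
The direct pixel average behind the unweighted correlation function,
Corr(dv) = <dF(v) dF(v + dv)> / Fbar^2, can be sketched as follows. The
separation is expressed as an integer pixel lag, so the velocity shift is
the lag times the pixel size, and the smoothed random series is only a
hypothetical stand-in for a forest spectrum.

```python
import random

def flux_correlation(flux, lag):
    """Flux auto-correlation at a separation of `lag` pixels."""
    n = len(flux)
    if lag < 0 or lag >= n:
        raise ValueError("lag must satisfy 0 <= lag < len(flux)")
    mean = sum(flux) / n
    pairs = n - lag
    cov = sum((flux[i] - mean) * (flux[i + lag] - mean)
              for i in range(pairs)) / pairs
    return cov / mean ** 2

# Toy spectrum: white noise smoothed over 10 pixels, so that neighbouring
# pixels are correlated while widely separated ones are not.
rng = random.Random(3)
white = [rng.random() for _ in range(5009)]
flux = [sum(white[i:i + 10]) / 10.0 for i in range(5000)]
corr = {lag: flux_correlation(flux, lag) for lag in (0, 1, 50)}
```

In the toy spectrum the correlation is large at a lag of one pixel and
scatters around zero at a lag of 50 pixels, qualitatively mimicking the
behaviour described above.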

Motivated by the lack of success in our previous method,
we explore a different approach to remove the signature of
correlated pixels. We introduce a weighting scheme in the
definition of the correlation function designed to give negligible
weight to correlated pixels in the Ly$\alpha$ forest. Adopting
this weight to estimate a new flux probability distribution, we
would immediately remove the imprints of the correlation.

If we introduce such a weighting scheme, Eq. [21] becomes



Corr( Av, w) = 



{F(v))jF{v + Av)\; 



(22) 



For this purpose, we explored two types of weighting functions:
$w_1(F) = F$, which removes the correlations for strong
absorbers, and $w_2(F) = F^2(1 - F)^2$, which accounts for the
correlation of strong and weak absorbers. With our new definition
of the correlation function, we recomputed Corr($\Delta v$, $w$)
for the same sight line as before and place our results into
context in Fig. [11]. While the first weight already significantly
reduces the correlations, the second one removes the intrinsic
correlations of the Ly$\alpha$ flux almost completely.

We then adopted $w_2(F)$ to recompute the FPD, which
now has a different shape with respect to that of Fig. [4] and
shows one pronounced peak for $0 < F < 1$. This new
weighted probability distribution is used to infer the likelihood
function following the same procedure as in Sect. [5.3].
From the most likely values of the proximity effect strength,
we reconstructed the PESDs, which are presented in
Fig. [12]. With this new approach, all the inferred distributions
not only present a significant bias with respect to the
input model, but this bias additionally changes from an
underestimation to an overestimation as the snapshot redshift
decreases. In spite of the complexity of this new method,
there are still uncontrolled systematics in the analysis of the
proximity effect which are not correctly accounted for, even
when introducing a weighting scheme.



5.5. Sampling the Ly$\alpha$ forest for the likelihood

None of the techniques presented so far performs better 
than the simple averaging technique in recovering the signa- 
ture of the proximity effect. Our next attempt to overcome the 
imprints of the mentioned correlations is based on the compu- 
tation of a different likelihood function. 

Until now we have proceeded with the computation of $\mathcal{L}$
following Eq. [20], where the product is performed considering
all the flux pixels in the spectrum. Due to the absorption
correlations and our difficulties in removing their signature,
we now try to apply a selection of the pixels from which the
product will be estimated. If we consider a set of flux pixels
separated by a few thousand km s$^{-1}$ ($\Delta v$), these points
will be uncorrelated according to Fig. [11]. From this set of flux
values we can estimate one likelihood function before moving
on to the next set of flux pixels.

Depending on the resolution of our synthetic spectra ($dv$),
we will have a set of $j$ likelihood functions where the exact
number is defined as $j = \Delta v/dv$. Each likelihood function
will then be maximized and yield one value of $\omega_\star^{\rm out}$.
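
The sampling can be sketched by splitting the pixel indices into j
interleaved sets, each containing pixels separated by the chosen velocity
interval, and maximizing one likelihood per set. The estimator used in
the demonstration is a closed-form maximum that is valid only for a
hypothetical flat unaffected FPD and a constant strength along the
spectrum; it stands in for the full maximization of Eq. [20]. The pixel
size and all names are assumptions, and the toy pixels are independent
by construction.

```python
import math
import random
import statistics

def interleaved_subsets(n_pixels, dv_pixel, dv_sep=2000.0):
    """Index sets of pixels separated by dv_sep; there are j = dv_sep/dv_pixel sets."""
    j = max(1, round(dv_sep / dv_pixel))
    return [list(range(start, n_pixels, j)) for start in range(j)]

def toy_mle_omega(fluxes):
    """Maximum of prod (1+w) F^w, i.e. a flat unaffected FPD (toy case only)."""
    return -len(fluxes) / sum(math.log(f) for f in fluxes) - 1.0

def sampled_estimate(flux, dv_pixel, dv_sep=2000.0, estimator=toy_mle_omega):
    """One estimate per interleaved set, plus their mean."""
    subsets = interleaved_subsets(len(flux), dv_pixel, dv_sep)
    estimates = [estimator([flux[i] for i in idx]) for idx in subsets if idx]
    return statistics.mean(estimates), estimates

# Toy spectrum: flat unaffected FPD, constant omega = 1, 10 km/s pixels.
rng = random.Random(7)
omega_true = 1.0
flux = [(1.0 - rng.random()) ** (1.0 / (1.0 + omega_true)) for _ in range(4000)]
omega_mean, omega_all = sampled_estimate(flux, dv_pixel=10.0)
```

With 4000 pixels of 10 km/s and a separation of 2000 km/s, each of the
200 sets holds only 20 pixels, which illustrates the trade-off between
the pixel separation and the number of points entering each likelihood.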

The distribution of $\omega_\star^{\rm out}$ depends on how many points
contribute to the particular set and behaves as follows: increasing
the pixel separation $\Delta v$, the number of pixels from which
the likelihood is estimated decreases, thus yielding a broader
distribution of $\omega_\star^{\rm out}$. Conversely, if the pixel separation is
too small, the influence of the correlation between pixels
increases, resulting in a biased estimate. We fix our separation to
$\Delta v = 2000$ km s$^{-1}$ and adopt the mean $\omega_\star^{\rm out}$ as a proxy for the
most likely indicator of the proximity effect strength.

Figure [13] presents the results, showing the PESDs at different
redshifts. In our highest redshift snapshot the modal value
of the PESD is, given the uncertainties, extremely close to the
input model (offset by $1\sigma$) with a dispersion in the strength
parameter significantly smaller than that of the simple averaging
technique. However, towards lower redshift, a significant
bias in the modal values appears again. We conclude that this
approach is not superior to the simple averaging technique.

Even if we could find two new pixel separations $\Delta v$ at redshifts
z = 2.25 and 3.0 which yield no biases in the recovery
of the reference model, we are aware of the drawbacks that
such an approach would have on real data. The biases that arise
in this method from a wrong choice of the sampling size are
extremely difficult to control because in real spectra we typically
lack the spectral information unaffected by the quasar.
At best, we therefore could only guess the appropriate
sampling size via numerical simulations, without being able
to test the accuracy.

Figure 13. The proximity effect strength distribution in three different sets of 500 sight lines at redshift z = 2.25, 3.0, and 4.0. The PESD has been constructed
employing a modified likelihood approach which adopted the unweighted FPD but samples the spectrum over 2000 km s$^{-1}$ to account for the absorption
correlations in the Ly$\alpha$ forest. The PESD remains biased with respect to the reference model and the amount of bias is now strongly dependent on redshift.

Figure 14. Comparison of the best estimate of the proximity effect strength
obtained with the different methods presented in this work. The black circles
show the outcome of the simple averaging technique. The squares, the crosses
and the triangles depict the simple likelihood, the likelihood with a weighted
FPD, and the sampled likelihood, respectively. A small redshift shift has been
applied to the data points to make them more easily recognizable. The modal
values and associated errors have been estimated with a bootstrap technique
as described in Sect. [5].

We summarize the results of all the techniques presented
in this work in Fig. [14]. None of the maximum likelihood
methods is capable of yielding tighter and unbiased
constraints on the proximity effect than the simple averaging
technique.

6. CONCLUSIONS 

We have analyzed a set of high-resolution, three- 
dimensional numerical simulations with a Hydro-Particle- 
Mesh code. We evolved the particle distribution in the sim- 
ulated box until a redshift of z = 2.25, and recorded seven 
snapshots within the range 2.25 < z < 4. For each snapshot 
we have drawn 500 randomly distributed sight lines through 
the simulated box, obtaining simulated spectra of the Ly$\alpha$
forest.

A sample of 40 high-resolution, high-S/N quasar spectra, 
with emission redshifts within the range 2.1 < z < 4.7, has 
been used to calibrate the simulated spectra. We have com- 
puted from the simulated sight lines (i) the evolution of the 
effective optical depth, (ii) the flux probability distribution 
function, and (iii) the column density distribution at differ- 
ent redshifts. While the computation of the synthetic line of 
sight depends on several free parameters, we have tuned them 
to be consistent (within the measured uncertainties) with the 
observational data in all three measurements. 



Our study is focused on developing and testing new techniques
of recovering the strength of the proximity effect along
individual sight lines. Our analysis has begun with a comparison
of the widely adopted combined analysis of the
proximity effect signal over multiple lines of sight with the
recently developed technique of estimating its strength on
individual quasar spectra. We refer to the latter as the simple
averaging technique. As the strength distribution is supposed
to be asymmetric, biases are expected to arise when determining
the combined proximity effect signal.

We have confirmed, with a realistic set of synthetic lines of
sight drawn from our numerical simulation, the existence of
these biases, albeit with a smaller amplitude than predicted with
Monte Carlo simulations. We have concluded that the smaller
bias is caused by the smaller scatter of the strength parameter.
We have confirmed that the modal value, or peak of the
proximity effect strength distribution (PESD), yields an
unbiased estimate of the input parameters used to compute the
proximity effect. Moreover, we have detected the expected
broadening in the shape of the PESD towards low redshift, as
predicted in Paper II using Monte Carlo simulations.

In principle, the simple averaging technique, by combining
observed pixels together, loses information. In order to
avoid this loss, we have investigated several incarnations of
a maximum likelihood approach. The first incarnation was a
standard implementation of the likelihood function. Due to
intrinsic (as opposed to thermal and instrumental broadening)
auto-correlation of the transmitted flux along a single line of
sight, this technique was subject to systematic bias at all
redshifts and for all models of the flux probability distribution.

In the second method we have used a weighting scheme, 
designed to reduce the intrinsic auto-correlation in the absorp- 
tion spectrum. While this weighting scheme was able to sub- 
stantially reduce the two-point autocorrelation functions of 
the flux, the resultant PESDs were significantly biased. This 
failure of the weighting scheme indicates that it is not a two- 
point, but some (currently unknown) higher order correlation 
function(s) that are primarily responsible for the bias in the 
maximum likelihood estimate of the proximity effect. 

In an attempt to reduce the bias, we have adopted a sampling
approach of widely separated flux points in the spectrum
to design a more complex likelihood function. While this
approach yielded a substantially more precise estimate of the
best-fit value, the value itself remained biased. That bias is
comparable to the statistical uncertainty of the measurement
at redshift z = 4.0, but becomes progressively larger towards
lower redshifts.

Thus, the newly introduced simple averaging technique, despite
the perceived loss of information during the averaging
procedure, is the only method among those tested here that
estimates the proximity effect signal free of biases.